PREAMBLE


This personal reaction is written from multiple perspectives: first and foremost, as the corresponding 
author of the original FAIR article; second, as the chair of the first High Level Expert Group (HLEG) of the 
European Open Science Cloud (EOSC) (which is how I met Jean-Claude); and third, from my current GO 
FAIR and CODATA perspective. None of what I write below is to be seen as a formal position of any of 
the organisations I am associated with. 


Let me start by stating that, after some periods of silent hope and of deep despair, I now strongly feel 
that, with the governance of the EOSC Association in place, EOSC will become a success after all. It will 
still be critical that the Association involves the member states (MSs) and actual researchers in an agile and 
non-bureaucratic manner, for which we need bottom-up mechanisms such as those operated by the Research 
Data Alliance (RDA) and GO FAIR. But a balancing formal entity is equally needed, and the formalised Strategic 
Research and Innovation Agenda [1] along which it operates, the Partnership proposal and the various “declarations”, 
including the recent one under the German presidency [2], are an excellent guiding roadmap to a successful 
EOSC, obviously in a global context. 


That said, at the risk of sounding like a broken record, this reaction should also look at the points where 
it went “almost” wrong, as we should try to learn from our mistakes. I may make some enemies (or 
strengthen the opinion of existing ones) in the process, but then, a wise old friend, who also wrote one 
of the reactions, once told me: “Barend, unless you made some enemies you probably lived in vain.” So 
I will speak my mind (“what’s new?”). I would also like to say that “EOSC” brought me some real new friends 
for life! 


First of all, the fact that FAIR became a hype term quickly after its inception, which was probably partly 
even accelerated by the prominent role it played in early EOSC discussions with the EC’s Director General, 
also has its downsides. As with the term “AI”, everyone co-opts the term and some start watering the 
concept down to a bloodless caricature of what it originally meant. In the case of FAIR this includes 
removing the central notion of machine actionability, mis-characterising it as a standard, conflating it with 
“open”, linking it only to data sensu stricto, ignoring software, algorithms and more. In general terms, this 
is done by people who sometimes seem never to have read the original article [3]. The most flagrant abuse of the term 
I have heard (obviously not from an active researcher) is this: “If data are Findable, Accessible and 
Interoperable it is ‘automatically’ Reusable.” This is of course “swearing in FAIR church”, as the R principles 
(R1-3) [3] clearly state that rich provenance and reuse conditions are critical, the provenance in particular. 
The decision whether (even high quality) data are fit for purpose (reuse in a particular study) is a critical 
step and is imho (in my humble opinion) at the basis of the reproducibility problem we currently face. 
Therefore, I would like to re-emphasise here my current one-liner to summarise the aim of the FAIR guiding 
principles: “The Machine Knows what I mean”. Those who feel that FAIR is too ambitious and, for instance, 
promote that “achieving F and A is enough for now” in my humble opinion fail to see the disruptive 
character of the solutions we need to make EOSC and its sister initiatives around the globe a real paradigm shift 
towards Open Science (OS). Or they are just trying to preserve the status quo and move incrementally at 
a pace they can follow. 


This nicely bridges to the first observation on EOSC as such. I indeed think that the first “Communication” 
that needed 126 iterations, mentioned by Jean-Claude, which happened in the same time frame as our 
“HLEG-1” period, was symptomatic of a basic flaw in the discussions, which still haunts us today. Conflating 
the “ICT”/HPC (or basic e-infrastructure) with the data and end user applications for analytics has caused 
an enormous hurdle. In the entire journey of the HLEG we had to carefully navigate around this cliff and 
it is still a highly controversial issue today. This part was the “Dunning Kruger effect” [4] pur sang: the 
“other side is easy” (because I am not hindered by any knowledge about it) and is “more or less already 
done” (because I do not understand the complexity). This is not only true for the active researchers who 
cannot use the current e-infrastructure efficiently (and naturally that is “entirely the fault of the nerds who 
build things I do not understand or cannot operate”), but also for e-infrastructure engineers who know 
everything about ICT and “thus” (?) also about data (because “that is just ones and zeros”), as Jean-Claude 
also noted. I also believe, however, that it is a mistake to completely separate the e-infrastructure from the data 
and services layer, as the e-infrastructure should route (and understand, at least at middleware level) what 
processes are needed on the data and how the FAIR services “run”. Nowadays (after many iterations) I use 
the diagram below (Figure 1) to explain that all three basic elements of the “Internet of FAIR Data and 
Services” are needed. Each of them should be adorned with FAIR (machine actionable) metadata to 
seamlessly form a Web of FAIR Data and Services on top of the current, proven Internet backbones, thus 
forming the “Internet of FAIR Data and Services”, eventually creating an “Internet for Social Machines” [5] 
where people and machines can both efficiently use all services, independently and in collaboration. 


[Figure 1: diagram; labels recovered from the image: “FAIR metadata”, “e-infrastructure”]


Figure 1. The Internet for Social Machines. Note: The final aim, the Internet for Social Machines, which enables 
seamless collaboration of people and computers, should be based on minimal, but rigorously required protocols 
and agreements. The current infrastructure that supports the Internet and the Web applications we know should 
be reused as much as possible, including its basic operation on TCP/IP and domain names. What needs to be 
added to realise the Internet of FAIR Data and Services on top of the current Internet is a Web of FAIR Data and 
Services. Applications (regardless of whether they work under parental guidance of people or not) should be able 
to Find, Access, Interoperate and (if relevant) Reuse data (and associated applications). Increasingly, (virtual) 
machines will operate largely independently from direct human interaction and therefore two basic elements are 
absolutely critical to make this all happen (comparable to the centrality of TCP/IP): (1) all elements (including 
the e-infrastructure) should be adorned with rich and machine readable (FAIR compliant) metadata; and (2) all 
elements of the Web of FAIR Data and Services should be composed of FAIR Digital Objects (FDOs) [6,7,8]. 


This absolutely does not mean that the foundation (e-infrastructure) of the triangle is “trivial” or “can be 
reused as is”. Not only the middleware, but also the crucial and fundamental concept of FDOs, which is largely 
domain-agnostic, needs to be developed in close collaboration between data and computer experts. 


The seamless combination will become the principal “package” of information that machines (and also 
people) can understand and act upon. Major infrastructure builders should actually co-lead this, while 
domain scientists need to decide which data formats and metadata schemes (i.e., FAIR Implementation 
Profiles [9]) should be built on this basic schema. 
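
To make the idea of such a “package” slightly more concrete, the following is a minimal sketch, not an implementation of the FDO framework [7]. It assumes a toy object that consists of a persistent identifier plus a metadata dictionary, and it checks only the two reuse-related elements stressed earlier (reuse conditions and provenance). The class name, the function name, the metadata keys and the identifiers are all hypothetical and chosen for illustration only; real FAIR Implementation Profiles define their own schemas.

```python
from dataclasses import dataclass, field

# Hypothetical metadata keys, for illustration only; they are not taken
# from any FDO specification or FAIR Implementation Profile.
REQUIRED_FOR_REUSE = ("license", "provenance")


@dataclass
class DigitalObject:
    """Toy stand-in for a FAIR Digital Object: an identifier plus metadata."""
    pid: str  # persistent identifier (placeholder values are used below)
    metadata: dict = field(default_factory=dict)


def missing_reuse_metadata(obj: DigitalObject) -> list:
    """Return the reuse-related metadata keys that are absent or empty."""
    return [key for key in REQUIRED_FOR_REUSE if not obj.metadata.get(key)]


# Findable and accessible, but silent on reuse conditions and provenance.
findable_only = DigitalObject(
    pid="example:123",
    metadata={"title": "Survey results", "access_url": "https://example.org/data"},
)

# The same object with the reuse-related metadata filled in.
reusable = DigitalObject(
    pid="example:456",
    metadata={
        "title": "Survey results",
        "access_url": "https://example.org/data",
        "license": "CC-BY-4.0",
        "provenance": "Collected in 2020, protocol v2",
    },
)

print(missing_reuse_metadata(findable_only))  # ['license', 'provenance']
print(missing_reuse_metadata(reusable))       # []
```

The point of the sketch is only that a machine can decide whether an object is a candidate for reuse when, and only when, the relevant metadata travel with the object. Being findable and accessible does not make it reusable by itself.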

“EOSC IS A BIGGER ME” 


Together with the Dunning Kruger effect, too many overlapping and redundant projects supporting the 
talking/meeting/landscaping, re-landscaping and re-re-landscaping have resulted in what I have come to call 
the “EOSC is a bigger Me syndrome”. On the one hand, countless people voluntarily invested (and still 
invest) their time in the development of the EOSC, but others seem to only see EOSC as yet another way 
to collect EC funding for their current solutions, which are in my opinion not future-proof and OS-proof. This 
imbalance between people investing their own time and effort based on intrinsic motivation and vision 
on the one hand, and the “reliance on EC subsidy” on the other, caused a dichotomy during the scoping years of EOSC 
between disruptive and “preservative” approaches. The heavy reliance on EC subsidy also largely ignored 
the subsidiarity principle [10] and the fact that 90% of the eventual infrastructures and services that we 
need for EOSC will be paid for by the MSs. Data and research intensive industry was also largely kept out of 
the loop, which was another mistake I have frequently pointed out. This helped to create and sustain the 
“Brussels Bubble” that Jean-Claude described. The Association will hopefully reverse that trend. 


Finally, the influence of the then-commissioner on the HLEG report was rather profound. The report was 
not only delayed by almost 6 months after the version proposed for publication, but there is also a nice additional 
“untold story” here: the originally proposed title of the report was “A Cloud on the 2020 Horizon”. In 
my original foreword I explained the slightly “gloomy” connotation of that title. When the report was 
finally approved, it appeared that the title had been unilaterally changed into “Realising A European Open 
Science Cloud” [11]. Not only did I have to hastily change my foreword (because it made no sense anymore), 
but my notorious statement that the “result” should neither be “European” (only), nor Open (only), nor 
(only) for Science and certainly not (just) a “Cloud” was also entirely ignored in changing that title. It again 
emphasises the “This is an EC thing” context, with the associated risk of confiscation of the concept by 
the “usual suspects” in EC subsidy land. However, I feel that after three years of intensive deliberations, which 
may be considered lightning fast on the geological time scale (see George’s reaction), we can conclude that 
most of the original HLEG recommendations are well-represented in the basic guiding documents of the 
EOSC Association, which makes me a happy man at the end of this crazy year. 


That leads me to the final observation: As a result of the (quote from Jean-Claude) “non-paper seen as 
the political turning point in support of EOSC” [12], GO FAIR (Global Open FAIR) [13] was started, 
originally by Germany and The Netherlands and soon joined by France, as a temporary “kick-start”, 
bottom-up approach to accelerate EOSC (see also recommendation I2.1 in the HLEG report, Appendix A). 


Soon, GO FAIR became really global and the agile modus operandi of practical Implementation Networks (INs) 
yielded a number of crucial approaches to speed up the adoption of the FAIR guiding principles and the 
hourglass approach [14]. Now, late 2020, when the EOSC Association is a fact, GO FAIR (1.0) has achieved 
its goals (early implementation steps) and we need to reflect on its future. Next to the intrinsic value of the 
active GO FAIR IN community [15] as such, several particular assets that I need to mention here are the 
development of the FAIR Implementation Profile and Metadata4Machines approach, the development of 
easy to install FAIR data points for open, FAIR metadata publication and indexing, and last but not least 
the international effort (involving many players, also outside the direct GO FAIR initiative) to develop the 
minimal specs of the FDO framework [7] in a more specified form than when coined in the FAIR expert 
group report [5]. These assets (all open source and open access) can not only be carried over to EOSC, 
but will also have a much wider, international impact, most likely leading to a continuation of GO FAIR (2.0) 
beyond its original time scope of three years, the predicted time it would take to complete the 
international policy and bureaucracy process to reach the status of a formal association as we have today. 
I hope the leaders of the Association will optimally learn from the successes, failures and near-road-accidents 
of the last three years and see EOSC as the European contribution to a “Global Open Science 
Commons”, also known as the Internet of FAIR Data and Services, in full, open collaboration with the 
international organisations that are now joining forces in the Data Together initiative [16]. After all, the 
major challenges we face are global, so is the research needed to face them and so are the solutions we 
hope to find. I fully trust the current leadership of the Association to make that vision reality. 

APPENDIX A: (from HLEG 1) CHALLENGES AND GENERAL OBSERVATIONS 


The majority of the challenges to reach a functional EOSC are social rather than technical. 

The major technical challenge is the complexity of the data and analytics procedures across disciplines 
rather than the size of the data per se. 

There is an alarming shortage of data experts both globally and in the European Union. 

This is partly based on an archaic reward and funding system for science and innovation, sustaining 
the article culture and preventing effective data publishing and reuse. 

The lack of core intermediary expertise has created a chasm between e-infrastructure providers and 
scientific domain specialists. 

Despite the success of the European Strategy Forum on Research Infrastructures (ESFRI), fragmentation 
across domains still produces repetitive and isolated solutions. 

The short and dispersed funding cycles of core research and e-infrastructures are not fit for the purpose 
of regulating and making effective use of global scientific data. 

Ever larger distributed data sets are increasingly immobile (e.g., for sheer size and privacy reasons) 
and centralised HPC alone is insufficient to support critically federated and distributed meta-analysis 
and learning. 

Notwithstanding the challenges, the components needed to create a first generation EOSC are largely 
there but they are lost in fragmentation and spread over 28 MSs and across different communities. 

There is no dedicated and mandated effort or instrument to coordinate EOSC-type activities across 
MSs. 


APPENDIX B: KEY FACTORS FOR THE EFFECTIVE DEVELOPMENT OF THE EOSC AS PART OF OS 

New modes of scholarly communication (with emphasis on machine actionability) need to be 
implemented. 

Modern reward and recognition practices need to support data sharing and re-use. 

Core data experts need to be trained and their career perspective significantly improved. 

Innovative, fit for purpose funding schemes are needed to support sustainable underpinning 
infrastructures and core resources. 

A real stimulus of multi-disciplinary collaboration requires specific measures in terms of review, 
funding and infrastructure. 

The transition from scientific insights towards innovation needs a dedicated support policy. 

The EOSC needs to be developed as a data infrastructure commons, that is an eco-system of 
infrastructures. 

Where possible, the EOSC should enable automation of data processing and thus machine actionability 
is key. 

Lightweight but internationally effective guiding governance should be developed. 

Key performance indicators should be developed for the EOSC. 

APPENDIX C: SPECIFIC RECOMMENDATIONS TO THE COMMISSION FOR A PREPARATORY 
PHASE 


Policy recommendations 

• P1: Take immediate, affirmative action on the EOSC in close concert with MSs. 

• P2: Close discussions about the “perceived need”. 

• P3: Build on existing capacity and expertise where possible. 

• P4: Frame the EOSC as the EU contribution to an Internet of FAIR Data and Services underpinned 
with open protocols. 


Governance recommendations 

• G1: Aim at the lightest possible, internationally effective governance. 

• G2: Guidance only where guidance is due (this relates to technical issues, best practices and social 
change). 

• G3: Define Rules of Engagement for service provision in the EOSC. 

• G4: Federate the gems and amplify good practice. 


Implementation recommendations 

• I1: Turn the HLEG report into a high-level guide to scope and guide the EOSC initiative. 

• I2: Develop, endorse and implement the Rules of Engagement for the EOSC. 

• I2.1: Set initial guiding principles to kick-start the initiative as quickly as possible. 

• I3: Fund a concerted effort to develop core data expertise in Europe. 

• I4: Develop a concrete plan for the architecture of data interoperability of the EOSC. 

• I5: Install an innovative guided funding scheme for the preparatory phase. 

• I6: Make adequate data stewardship mandatory for all research proposals. 

• I7: Provide a clear operational timeline to deal with the early preparatory phase of the EOSC. 